This is a simplified version of the program used for the Kaggle competition “House Prices”
12/2016



Data descriptions

Ask a home buyer to describe their dream house, and they probably won’t begin with the height of the basement ceiling or the proximity to an east-west railroad. But this playground competition’s dataset proves that much more influences price negotiations than the number of bedrooms or a white-picket fence.

With 79 explanatory variables describing (almost) every aspect of residential homes in Ames, Iowa, this competition challenges you to predict the final price of each home.

The potential for creative feature engineering provides a rich opportunity for fun and learning. This dataset lends itself to advanced regression techniques like random forests and gradient boosting with the popular XGBoost library. We encourage Kagglers to create benchmark code and tutorials on Kernels for community learning. Top kernels will be awarded swag prizes at the competition close.

Acknowledgments

The Ames Housing dataset was compiled by Dean De Cock for use in data science education. It’s an incredible alternative for data scientists looking for a modernized and expanded version of the often cited Boston Housing dataset.




Find the data and the code on Kaggle or on Github



File descriptions

train.csv - the training set
test.csv - the test set
data_description.txt - full description of each column, originally prepared by Dean De Cock but lightly edited to match the column names used here
sample_submission.csv - a benchmark submission from a linear regression on year and month of sale, lot square footage, and number of bedrooms
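
The benchmark behind sample_submission.csv can be sketched as follows. This is a minimal illustration, not the organisers' code: the toy data frames are synthetic stand-ins for train.csv and test.csv (only the column names follow the dataset), and the object names toy_train, toy_test and benchmark_fit are hypothetical.

```r
# Synthetic stand-in for train.csv: values are made up, only the column
# names (YrSold, MoSold, LotArea, BedroomAbvGr, SalePrice) follow the dataset.
set.seed(1)
n <- 200
toy_train <- data.frame(
  YrSold       = sample(2006:2010, n, replace = TRUE),
  MoSold       = sample(1:12, n, replace = TRUE),
  LotArea      = round(runif(n, 1300, 20000)),
  BedroomAbvGr = sample(1:5, n, replace = TRUE)
)
toy_train$SalePrice <- 50000 + 8 * toy_train$LotArea +
  15000 * toy_train$BedroomAbvGr + rnorm(n, sd = 20000)

# Linear regression on the four predictors named in the file description
benchmark_fit <- lm(SalePrice ~ YrSold + MoSold + LotArea + BedroomAbvGr,
                    data = toy_train)

# Predict for new rows and lay the result out as an Id / SalePrice table
toy_test <- data.frame(Id = 1461:1463,
                       YrSold = c(2008, 2009, 2010),
                       MoSold = c(6, 1, 11),
                       LotArea = c(9000, 12000, 7500),
                       BedroomAbvGr = c(3, 4, 2))
submission <- data.frame(Id = toy_test$Id,
                         SalePrice = predict(benchmark_fit, newdata = toy_test))
print(submission)
```

With the real files, toy_train and toy_test would be replaced by the data read from train.csv and test.csv.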







Variables

SalePrice - the property’s sale price in dollars. This is the target variable that you’re trying to predict.
MSSubClass: The building class
MSZoning: The general zoning classification
LotFrontage: Linear feet of street connected to property
LotArea: Lot size in square feet
Street: Type of road access
Alley: Type of alley access
LotShape: General shape of property
LandContour: Flatness of the property
Utilities: Type of utilities available
LotConfig: Lot configuration
LandSlope: Slope of property
Neighborhood: Physical locations within Ames city limits
Condition1: Proximity to main road or railroad
Condition2: Proximity to main road or railroad (if a second is present)
BldgType: Type of dwelling
HouseStyle: Style of dwelling
OverallQual: Overall material and finish quality
OverallCond: Overall condition rating
YearBuilt: Original construction date
YearRemodAdd: Remodel date
RoofStyle: Type of roof
RoofMatl: Roof material
Exterior1st: Exterior covering on house
Exterior2nd: Exterior covering on house (if more than one material)
MasVnrType: Masonry veneer type
MasVnrArea: Masonry veneer area in square feet
ExterQual: Exterior material quality
ExterCond: Present condition of the material on the exterior
Foundation: Type of foundation
BsmtQual: Height of the basement
BsmtCond: General condition of the basement
BsmtExposure: Walkout or garden level basement walls
BsmtFinType1: Quality of basement finished area
BsmtFinSF1: Type 1 finished square feet
BsmtFinType2: Quality of second finished area (if present)
BsmtFinSF2: Type 2 finished square feet
BsmtUnfSF: Unfinished square feet of basement area
TotalBsmtSF: Total square feet of basement area
Heating: Type of heating
HeatingQC: Heating quality and condition
CentralAir: Central air conditioning
Electrical: Electrical system
1stFlrSF: First Floor square feet
2ndFlrSF: Second floor square feet
LowQualFinSF: Low quality finished square feet (all floors)
GrLivArea: Above grade (ground) living area square feet
BsmtFullBath: Basement full bathrooms
BsmtHalfBath: Basement half bathrooms
FullBath: Full bathrooms above grade
HalfBath: Half baths above grade
BedroomAbvGr: Number of bedrooms above basement level
KitchenAbvGr: Number of kitchens
KitchenQual: Kitchen quality
TotRmsAbvGrd: Total rooms above grade (does not include bathrooms)
Functional: Home functionality rating
Fireplaces: Number of fireplaces
FireplaceQu: Fireplace quality
GarageType: Garage location
GarageYrBlt: Year garage was built
GarageFinish: Interior finish of the garage
GarageCars: Size of garage in car capacity
GarageArea: Size of garage in square feet
GarageQual: Garage quality
GarageCond: Garage condition
PavedDrive: Paved driveway
WoodDeckSF: Wood deck area in square feet
OpenPorchSF: Open porch area in square feet
EnclosedPorch: Enclosed porch area in square feet
3SsnPorch: Three season porch area in square feet
ScreenPorch: Screen porch area in square feet
PoolArea: Pool area in square feet
PoolQC: Pool quality
Fence: Fence quality
MiscFeature: Miscellaneous feature not covered in other categories
MiscVal: Dollar value of miscellaneous feature
MoSold: Month Sold
YrSold: Year Sold
SaleType: Type of sale
SaleCondition: Condition of sale







About Kaggle

Kaggle was founded in 2010 as a platform for predictive modelling and analytics competitions: companies and researchers post their data, and statisticians and data miners from all over the world compete to produce the best models.

This crowdsourcing approach relies on the fact that there are countless strategies that can be applied to any predictive modelling task and it is impossible to know at the outset which technique or analyst will be most effective. Kaggle also hosts recruiting competitions in which data scientists compete for a chance to interview at leading data science companies like Facebook, Winton Capital, and Walmart.







Libraries


# For data manipulation and tidying
library(MASS)
library(tidyr)
library(plyr)
library(dplyr)
library(broom)
library(data.table)
library(testthat)
library(gridExtra)

# For data visualization, feature selection and missing-value handling
library(ggplot2)
library(plotly)
library(DT)
library(corrplot)
library(GGally)
library(Boruta)
library(pROC)
library(VIM)
library(mice)

# For modeling and predictions
library(mlbench)
library(caret)
library(glmnet)
library(ranger)
library(clValid)
library(e1071)
library(xgboost)



Data Exploration


Data

train <- read.csv("train.csv", header = TRUE, sep = ",", stringsAsFactors = FALSE)
data.test <- read.csv("test.csv", header = TRUE, sep = ",", stringsAsFactors = FALSE)
datatable(head(train, n=20),options = list(scrollX = TRUE))



Summary

summary(train)
##        Id           MSSubClass      MSZoning          LotFrontage    
##  Min.   :   1.0   Min.   : 20.0   Length:1460        Min.   : 21.00  
##  1st Qu.: 365.8   1st Qu.: 20.0   Class :character   1st Qu.: 59.00  
##  Median : 730.5   Median : 50.0   Mode  :character   Median : 69.00  
##  Mean   : 730.5   Mean   : 56.9                      Mean   : 70.05  
##  3rd Qu.:1095.2   3rd Qu.: 70.0                      3rd Qu.: 80.00  
##  Max.   :1460.0   Max.   :190.0                      Max.   :313.00  
##                                                      NA's   :259     
##     LotArea          Street             Alley             LotShape        
##  Min.   :  1300   Length:1460        Length:1460        Length:1460       
##  1st Qu.:  7554   Class :character   Class :character   Class :character  
##  Median :  9478   Mode  :character   Mode  :character   Mode  :character  
##  Mean   : 10517                                                           
##  3rd Qu.: 11602                                                           
##  Max.   :215245                                                           
##                                                                           
##  LandContour         Utilities          LotConfig        
##  Length:1460        Length:1460        Length:1460       
##  Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character  
##                                                          
##                                                          
##                                                          
##                                                          
##   LandSlope         Neighborhood        Condition1       
##  Length:1460        Length:1460        Length:1460       
##  Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character  
##                                                          
##                                                          
##                                                          
##                                                          
##   Condition2          BldgType          HouseStyle         OverallQual    
##  Length:1460        Length:1460        Length:1460        Min.   : 1.000  
##  Class :character   Class :character   Class :character   1st Qu.: 5.000  
##  Mode  :character   Mode  :character   Mode  :character   Median : 6.000  
##                                                           Mean   : 6.099  
##                                                           3rd Qu.: 7.000  
##                                                           Max.   :10.000  
##                                                                           
##   OverallCond      YearBuilt     YearRemodAdd   RoofStyle        
##  Min.   :1.000   Min.   :1872   Min.   :1950   Length:1460       
##  1st Qu.:5.000   1st Qu.:1954   1st Qu.:1967   Class :character  
##  Median :5.000   Median :1973   Median :1994   Mode  :character  
##  Mean   :5.575   Mean   :1971   Mean   :1985                     
##  3rd Qu.:6.000   3rd Qu.:2000   3rd Qu.:2004                     
##  Max.   :9.000   Max.   :2010   Max.   :2010                     
##                                                                  
##    RoofMatl         Exterior1st        Exterior2nd       
##  Length:1460        Length:1460        Length:1460       
##  Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character  
##                                                          
##                                                          
##                                                          
##                                                          
##   MasVnrType          MasVnrArea      ExterQual          ExterCond        
##  Length:1460        Min.   :   0.0   Length:1460        Length:1460       
##  Class :character   1st Qu.:   0.0   Class :character   Class :character  
##  Mode  :character   Median :   0.0   Mode  :character   Mode  :character  
##                     Mean   : 103.7                                        
##                     3rd Qu.: 166.0                                        
##                     Max.   :1600.0                                        
##                     NA's   :8                                             
##   Foundation          BsmtQual           BsmtCond        
##  Length:1460        Length:1460        Length:1460       
##  Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character  
##                                                          
##                                                          
##                                                          
##                                                          
##  BsmtExposure       BsmtFinType1         BsmtFinSF1     BsmtFinType2      
##  Length:1460        Length:1460        Min.   :   0.0   Length:1460       
##  Class :character   Class :character   1st Qu.:   0.0   Class :character  
##  Mode  :character   Mode  :character   Median : 383.5   Mode  :character  
##                                        Mean   : 443.6                     
##                                        3rd Qu.: 712.2                     
##                                        Max.   :5644.0                     
##                                                                           
##    BsmtFinSF2        BsmtUnfSF       TotalBsmtSF       Heating         
##  Min.   :   0.00   Min.   :   0.0   Min.   :   0.0   Length:1460       
##  1st Qu.:   0.00   1st Qu.: 223.0   1st Qu.: 795.8   Class :character  
##  Median :   0.00   Median : 477.5   Median : 991.5   Mode  :character  
##  Mean   :  46.55   Mean   : 567.2   Mean   :1057.4                     
##  3rd Qu.:   0.00   3rd Qu.: 808.0   3rd Qu.:1298.2                     
##  Max.   :1474.00   Max.   :2336.0   Max.   :6110.0                     
##                                                                        
##   HeatingQC          CentralAir         Electrical          X1stFlrSF   
##  Length:1460        Length:1460        Length:1460        Min.   : 334  
##  Class :character   Class :character   Class :character   1st Qu.: 882  
##  Mode  :character   Mode  :character   Mode  :character   Median :1087  
##                                                           Mean   :1163  
##                                                           3rd Qu.:1391  
##                                                           Max.   :4692  
##                                                                         
##    X2ndFlrSF     LowQualFinSF       GrLivArea     BsmtFullBath   
##  Min.   :   0   Min.   :  0.000   Min.   : 334   Min.   :0.0000  
##  1st Qu.:   0   1st Qu.:  0.000   1st Qu.:1130   1st Qu.:0.0000  
##  Median :   0   Median :  0.000   Median :1464   Median :0.0000  
##  Mean   : 347   Mean   :  5.845   Mean   :1515   Mean   :0.4253  
##  3rd Qu.: 728   3rd Qu.:  0.000   3rd Qu.:1777   3rd Qu.:1.0000  
##  Max.   :2065   Max.   :572.000   Max.   :5642   Max.   :3.0000  
##                                                                  
##   BsmtHalfBath        FullBath        HalfBath       BedroomAbvGr  
##  Min.   :0.00000   Min.   :0.000   Min.   :0.0000   Min.   :0.000  
##  1st Qu.:0.00000   1st Qu.:1.000   1st Qu.:0.0000   1st Qu.:2.000  
##  Median :0.00000   Median :2.000   Median :0.0000   Median :3.000  
##  Mean   :0.05753   Mean   :1.565   Mean   :0.3829   Mean   :2.866  
##  3rd Qu.:0.00000   3rd Qu.:2.000   3rd Qu.:1.0000   3rd Qu.:3.000  
##  Max.   :2.00000   Max.   :3.000   Max.   :2.0000   Max.   :8.000  
##                                                                    
##   KitchenAbvGr   KitchenQual         TotRmsAbvGrd     Functional       
##  Min.   :0.000   Length:1460        Min.   : 2.000   Length:1460       
##  1st Qu.:1.000   Class :character   1st Qu.: 5.000   Class :character  
##  Median :1.000   Mode  :character   Median : 6.000   Mode  :character  
##  Mean   :1.047                      Mean   : 6.518                     
##  3rd Qu.:1.000                      3rd Qu.: 7.000                     
##  Max.   :3.000                      Max.   :14.000                     
##                                                                        
##    Fireplaces    FireplaceQu         GarageType         GarageYrBlt  
##  Min.   :0.000   Length:1460        Length:1460        Min.   :1900  
##  1st Qu.:0.000   Class :character   Class :character   1st Qu.:1961  
##  Median :1.000   Mode  :character   Mode  :character   Median :1980  
##  Mean   :0.613                                         Mean   :1979  
##  3rd Qu.:1.000                                         3rd Qu.:2002  
##  Max.   :3.000                                         Max.   :2010  
##                                                        NA's   :81    
##  GarageFinish         GarageCars      GarageArea      GarageQual       
##  Length:1460        Min.   :0.000   Min.   :   0.0   Length:1460       
##  Class :character   1st Qu.:1.000   1st Qu.: 334.5   Class :character  
##  Mode  :character   Median :2.000   Median : 480.0   Mode  :character  
##                     Mean   :1.767   Mean   : 473.0                     
##                     3rd Qu.:2.000   3rd Qu.: 576.0                     
##                     Max.   :4.000   Max.   :1418.0                     
##                                                                        
##   GarageCond         PavedDrive          WoodDeckSF      OpenPorchSF    
##  Length:1460        Length:1460        Min.   :  0.00   Min.   :  0.00  
##  Class :character   Class :character   1st Qu.:  0.00   1st Qu.:  0.00  
##  Mode  :character   Mode  :character   Median :  0.00   Median : 25.00  
##                                        Mean   : 94.24   Mean   : 46.66  
##                                        3rd Qu.:168.00   3rd Qu.: 68.00  
##                                        Max.   :857.00   Max.   :547.00  
##                                                                         
##  EnclosedPorch      X3SsnPorch      ScreenPorch        PoolArea      
##  Min.   :  0.00   Min.   :  0.00   Min.   :  0.00   Min.   :  0.000  
##  1st Qu.:  0.00   1st Qu.:  0.00   1st Qu.:  0.00   1st Qu.:  0.000  
##  Median :  0.00   Median :  0.00   Median :  0.00   Median :  0.000  
##  Mean   : 21.95   Mean   :  3.41   Mean   : 15.06   Mean   :  2.759  
##  3rd Qu.:  0.00   3rd Qu.:  0.00   3rd Qu.:  0.00   3rd Qu.:  0.000  
##  Max.   :552.00   Max.   :508.00   Max.   :480.00   Max.   :738.000  
##                                                                      
##     PoolQC             Fence           MiscFeature       
##  Length:1460        Length:1460        Length:1460       
##  Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character  
##                                                          
##                                                          
##                                                          
##                                                          
##     MiscVal             MoSold           YrSold       SaleType        
##  Min.   :    0.00   Min.   : 1.000   Min.   :2006   Length:1460       
##  1st Qu.:    0.00   1st Qu.: 5.000   1st Qu.:2007   Class :character  
##  Median :    0.00   Median : 6.000   Median :2008   Mode  :character  
##  Mean   :   43.49   Mean   : 6.322   Mean   :2008                     
##  3rd Qu.:    0.00   3rd Qu.: 8.000   3rd Qu.:2009                     
##  Max.   :15500.00   Max.   :12.000   Max.   :2010                     
##                                                                       
##  SaleCondition        SalePrice     
##  Length:1460        Min.   : 34900  
##  Class :character   1st Qu.:129975  
##  Mode  :character   Median :163000  
##                     Mean   :180921  
##                     3rd Qu.:214000  
##                     Max.   :755000  
## 



Visualisation

Thanks to Laurae2 for this code, which plots all the data using tableplots. The objective is to spot some of the good features visually. As Laurae2 puts it, you can think of the vertical axis as “sort by SalePrice”:

invisible(library(tabplot))
invisible(library(data.table))

columns <- c("numeric",
             rep("character", 2),
             rep("numeric", 2),
             rep("character", 12),
             rep("numeric", 4),
             rep("character", 5),
             "numeric",
             rep("character", 7),
             "numeric",
             "character",
             rep("numeric", 3),
             rep("character", 4),
             rep("numeric", 10),
             "character",
             "numeric",
             "character",
             "numeric",
             rep("character", 2),
             "numeric",
             "character",
             rep("numeric", 2),
             rep("character", 3),
             rep("numeric", 6),
             rep("character", 3),
             rep("numeric", 3),
             rep("character", 2),
             rep("numeric"))

train$SalePrice <- log(train$SalePrice) # log scale, to match the competition metric (RMSE on log prices)
train_visu <- as.data.frame(train)

for (i in 1:80) {
  if (typeof(train_visu[, i]) == "character") {
    train_visu[is.na(train_visu[, i]), i] <- ""
    train_visu[, i] <- as.factor(train_visu[, i])
  }
}

for (i in 1:16) {
  plot(tableplot(train_visu,
                 select = c(((i - 1) * 5 + 1):(i * 5), 81),
                 sortCol = 6, nBins = 73, plot = FALSE),
       fontsize = 12,
       title = paste("log(SalePrice) vs ",
                     paste(colnames(train_visu)[((i - 1) * 5 + 1):(i * 5)], collapse = "+"),
                     sep = ""),
       showTitle = TRUE, fontsize.title = 12)
}
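
SalePrice is put on the log scale before plotting because the competition scores submissions by the root mean squared error between the logarithm of the predicted price and the logarithm of the observed price. A minimal sketch of that metric is given here; the function name rmse_log and the example prices are illustrative assumptions, not part of the original program.

```r
# Root mean squared error on the log scale.
# `actual` and `predicted` are sale prices in dollars and must be positive.
rmse_log <- function(actual, predicted) {
  stopifnot(length(actual) == length(predicted),
            all(actual > 0), all(predicted > 0))
  sqrt(mean((log(predicted) - log(actual))^2))
}

# Illustrative values only (not taken from the dataset)
actual    <- c(100000, 200000, 300000)
predicted <- c(110000, 190000, 300000)
rmse_log(actual, predicted)

# A model trained on log(SalePrice) predicts on the log scale,
# so predictions go through exp() before being written to a submission.
log_pred <- log(predicted)
all.equal(exp(log_pred), predicted)
```

Because the error is measured on logs, being off by 10% costs about the same for a cheap house as for an expensive one.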




Boruta

Thanks to Jim Thompson (JMT5802) for this Boruta feature importance analysis. It determines which features may be relevant to predicting the house sale price, using the Boruta package. The code can be found here.

ID.VAR <- "Id"
TARGET.VAR <- "SalePrice"

# Data preparation for Boruta analysis
# Assumption: ROOT.DIR is not defined in this excerpt; ".." fits a layout where
# the data sit in ../input. Adjust it to your own setup.
ROOT.DIR <- ".."
# retrieve data for analysis
sample.df <- read.csv(file.path(ROOT.DIR,"input/train.csv"),stringsAsFactors = FALSE)
# extract only candidate feature names
candidate.features <- setdiff(names(sample.df),c(ID.VAR,TARGET.VAR))
data.type <- sapply(candidate.features,function(x){class(sample.df[[x]])})
# determine data types
explanatory.attributes <- setdiff(names(sample.df),c(ID.VAR,TARGET.VAR))
data.classes <- sapply(explanatory.attributes,function(x){class(sample.df[[x]])})
# group attribute names by data type
unique.classes <- unique(data.classes)
attr.data.types <- lapply(unique.classes,function(x){names(data.classes[data.classes==x])})
names(attr.data.types) <- unique.classes

# Prepare data set for Boruta analysis. For this analysis, missing values are
# handled as follows:
# * missing numeric data is set to -1
# * missing character data is set to "*MISSING*"

# pull out the response variable
response <- sample.df$SalePrice

# remove identifier and response variables
sample.df <- sample.df[candidate.features]

# for numeric set missing values to -1 for purposes of the random forest run
for (x in attr.data.types$integer){
  sample.df[[x]][is.na(sample.df[[x]])] <- -1
}

for (x in attr.data.types$character){
  sample.df[[x]][is.na(sample.df[[x]])] <- "*MISSING*"
}

# Run Boruta Analysis
set.seed(13)
bor.results <- Boruta(sample.df,response,
                   maxRuns=101,
                   doTrace=0)
cat("\nSummary of Boruta run:\n")
print(bor.results)

cat("\n\nRelevant Attributes:\n")
getSelectedAttributes(bor.results)
plot(bor.results)

# Detailed results for each candidate explanatory attribute.

cat("\n\nAttribute Importance Details:\n")
options(width=125)
arrange(cbind(attr=rownames(attStats(bor.results)), attStats(bor.results)),desc(medianImp))
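
The missing-value step in the Boruta preparation loops over the columns of class integer and of class character. A column read as class numeric (a double) would be skipped by that loop, so a more general version is sketched here. The helper name fill_missing_sentinel and the toy data frame are hypothetical; the sentinels -1 and "*MISSING*" are the ones used above.

```r
# Replace NAs with sentinels: -1 for numeric columns (integer or double),
# "*MISSING*" for character columns. Other column types are left untouched.
fill_missing_sentinel <- function(df, num_fill = -1, chr_fill = "*MISSING*") {
  for (col in names(df)) {
    is_missing <- is.na(df[[col]])
    if (!any(is_missing)) next
    if (is.numeric(df[[col]])) {
      df[[col]][is_missing] <- num_fill
    } else if (is.character(df[[col]])) {
      df[[col]][is_missing] <- chr_fill
    }
  }
  df
}

# Toy data: made-up values, column names borrowed from the dataset
toy <- data.frame(LotFrontage = c(65L, NA, 80L),
                  MasVnrArea  = c(196.5, NA, 0),
                  Alley       = c(NA, "Grvl", NA),
                  stringsAsFactors = FALSE)
filled <- fill_missing_sentinel(toy)
print(filled)
```

Sentinel values like these suit a tree-based importance run such as Boruta; they are not meant as an imputation for the later models.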




Missing values

aggr(train, prop = FALSE, numbers = TRUE)

colSums(is.na(train))
##            Id    MSSubClass      MSZoning   LotFrontage       LotArea 
##             0             0             0           259             0 
##        Street         Alley      LotShape   LandContour     Utilities 
##             0          1369             0             0             0 
##     LotConfig     LandSlope  Neighborhood    Condition1    Condition2 
##             0             0             0             0             0 
##      BldgType    HouseStyle   OverallQual   OverallCond     YearBuilt 
##             0             0             0             0             0 
##  YearRemodAdd     RoofStyle      RoofMatl   Exterior1st   Exterior2nd 
##             0             0             0             0             0 
##    MasVnrType    MasVnrArea     ExterQual     ExterCond    Foundation 
##             8             8             0             0             0 
##      BsmtQual      BsmtCond  BsmtExposure  BsmtFinType1    BsmtFinSF1 
##            37            37            38            37             0 
##  BsmtFinType2    BsmtFinSF2     BsmtUnfSF   TotalBsmtSF       Heating 
##            38             0             0             0             0 
##     HeatingQC    CentralAir    Electrical     X1stFlrSF     X2ndFlrSF 
##             0             0             1             0             0 
##  LowQualFinSF     GrLivArea  BsmtFullBath  BsmtHalfBath      FullBath 
##             0             0             0             0             0 
##      HalfBath  BedroomAbvGr  KitchenAbvGr   KitchenQual  TotRmsAbvGrd 
##             0             0             0             0             0 
##    Functional    Fireplaces   FireplaceQu    GarageType   GarageYrBlt 
##             0             0           690            81            81 
##  GarageFinish    GarageCars    GarageArea    GarageQual    GarageCond 
##            81             0             0            81            81 
##    PavedDrive    WoodDeckSF   OpenPorchSF EnclosedPorch    X3SsnPorch 
##             0             0             0             0             0 
##   ScreenPorch      PoolArea        PoolQC         Fence   MiscFeature 
##             0             0          1453          1179          1406 
##       MiscVal        MoSold        YrSold      SaleType SaleCondition 
##             0             0             0             0             0 
##     SalePrice 
##             0
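
Raw counts are easier to judge as a share of the rows: PoolQC, for example, is missing in 1453 of 1460 rows (about 99.5%). A small sketch that ranks columns by percentage missing is given here; missing_share is a hypothetical helper and the toy data frame holds made-up values.

```r
# Share of missing values per column in percent, largest first.
# Columns without any missing value are dropped from the result.
missing_share <- function(df) {
  share <- colMeans(is.na(df)) * 100
  sort(share[share > 0], decreasing = TRUE)
}

# Toy data: made-up values, column names borrowed from the dataset
toy <- data.frame(PoolQC      = c(NA, NA, NA, "Gd"),
                  LotFrontage = c(65, NA, 80, 60),
                  LotArea     = c(8450, 9600, 11250, 9550),
                  stringsAsFactors = FALSE)
round(missing_share(toy), 1)
```

Applied to the training data, the call would be missing_share(train).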



Data Preparation


The goal here is to select the most relevant features, reshape them, handle missing values and outliers and get data ready to be processed by different machine learning models.


Feature selection

# 1. Incorporate results of Boruta analysis
Boruta_analysis <- c("MSSubClass","MSZoning","LotArea","LotShape","LandContour","Neighborhood",
                    "BldgType","HouseStyle","OverallQual","OverallCond","YearBuilt",
                    "YearRemodAdd","Exterior1st","Exterior2nd","MasVnrArea","ExterQual",
                    "Foundation","BsmtQual","BsmtCond","BsmtFinType1","BsmtFinSF1",
                    "BsmtFinType2","BsmtUnfSF","TotalBsmtSF","HeatingQC","CentralAir",
                    "X1stFlrSF","X2ndFlrSF","GrLivArea","BsmtFullBath","FullBath","HalfBath",
                    "BedroomAbvGr","KitchenAbvGr","KitchenQual","TotRmsAbvGrd","Functional",
                    "Fireplaces","FireplaceQu","GarageType","GarageYrBlt","GarageFinish",
                    "GarageCars","GarageArea","GarageQual","GarageCond","PavedDrive","WoodDeckSF",
                    "OpenPorchSF","Fence", "SalePrice")

train_selected_boruta <- train[Boruta_analysis]

# Identify near zero variance predictors: remove_cols
remove_cols <- nearZeroVar(train_selected_boruta, names = TRUE, 
                           freqCut = 2, uniqueCut = 20)

# Remove predictors with low variance 
all_cols <- names(train_selected_boruta)
train_selected <- train_selected_boruta[ , setdiff(all_cols, remove_cols)]
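
A base-R sketch of the rule behind nearZeroVar may help to see what freqCut = 2 and uniqueCut = 20 do. As documented for caret, a predictor is flagged when the frequency of its most common value divided by that of the second most common exceeds freqCut and the percentage of distinct values is at most uniqueCut. The package defaults are 95/5 and 10, so the settings used here flag far more predictors. The function nzv_stats is a hypothetical, simplified reading of that rule, not caret's implementation.

```r
# Frequency ratio and percentage of unique values for one predictor,
# plus the resulting flag under the given cut-offs.
nzv_stats <- function(x, freqCut = 2, uniqueCut = 20) {
  counts <- sort(as.vector(table(x)), decreasing = TRUE)  # NAs are ignored
  freq_ratio <- if (length(counts) >= 2) counts[1] / counts[2] else 0
  percent_unique <- 100 * length(counts) / length(x)
  data.frame(freqRatio = freq_ratio,
             percentUnique = percent_unique,
             flagged = length(counts) <= 1 ||
               (freq_ratio > freqCut && percent_unique <= uniqueCut))
}

# Toy predictors (made-up values)
mostly_one <- c(rep("Y", 18), "N", "N")        # one dominant level: 18 vs 2
balanced   <- rep(c("A", "B", "C", "D"), 5)    # four equally common levels

nzv_stats(mostly_one)
nzv_stats(balanced)
```

In this toy case the dominant-level predictor is flagged and the balanced one is kept.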



Types

# transform all the character variables into factors
train_selected[sapply(train_selected, is.character)] <- lapply(train_selected[sapply(train_selected, is.character)], as.factor)



Outliers

train_selected_outliers <- train_selected
# remove outliers: keep rows whose SalePrice lies between the 1st and 99th percentiles
price_bounds <- quantile(train_selected$SalePrice, probs = c(.01, .99))
train_selected <- subset(train_selected, SalePrice >= price_bounds[1] & SalePrice <= price_bounds[2])

par(mfrow=c(1,2))
boxplot(train_selected_outliers$SalePrice, main="Before")
boxplot(train_selected$SalePrice, main="After")




Missing values

Heuristics or rules of thumb

  • Numeric variables: pmm (predictive mean matching)
  • Binary variables (2 levels): logreg (logistic regression)
  • Factor variables (> 2 levels): polyreg (polytomous logistic regression)
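
The rules of thumb above could be turned into a per-column method vector, sketched here in base R. The helper choose_mice_method and the toy data frame are hypothetical; the method names are the ones listed above, and an empty string marks a column that needs no imputation.

```r
# Pick an imputation method name per column, following the rules of thumb.
# Columns without missing values (or of an unhandled type) get "".
choose_mice_method <- function(df) {
  vapply(df, function(col) {
    if (!anyNA(col)) return("")
    if (is.numeric(col)) return("pmm")
    if (is.factor(col) && nlevels(col) == 2) return("logreg")
    if (is.factor(col)) return("polyreg")
    ""
  }, character(1))
}

# Toy data: made-up values, column names borrowed from the dataset
toy <- data.frame(
  MasVnrArea  = c(196, NA, 162, 0),
  CentralAir  = factor(c("Y", NA, "N", "Y")),
  FireplaceQu = factor(c("TA", NA, "Gd", "Ex")),
  LotArea     = c(8450, 9600, 11250, 9550)
)
imp_methods <- choose_mice_method(toy)
print(imp_methods)
```

Such a vector could be passed to mice through its method argument instead of a single method name.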

To keep this model simple, we use pmm for every variable.

# numeric columns with missing values: MasVnrArea, GarageYrBlt
# factor columns with missing values: BsmtQual, BsmtFinType1, FireplaceQu, GarageFinish
# FireplaceQu is missing in almost 50% of the rows (690 of the 1460 training rows)
tempData <- mice(train_selected,m=5,maxit=50,meth='pmm',seed=500)
## 
##  iter imp variable
##   1   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   1   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   1   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   1   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   1   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   2   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   2   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   2   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   2   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   2   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   3   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   3   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   3   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   3   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   3   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   4   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   4   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   4   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   4   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   4   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   5   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   5   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   5   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   5   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   5   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   6   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   6   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   6   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   6   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   6   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   7   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   7   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   7   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   7   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   7   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   8   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   8   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   8   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   8   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   8   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   9   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   9   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   9   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   9   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   9   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   10   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   10   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   10   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   10   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   10   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   11   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   11   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   11   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   11   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   11   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   12   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   12   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   12   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   12   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   12   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   13   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   13   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   13   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   13   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   13   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   14   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   14   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   14   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   14   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   14   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   15   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   15   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   15   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   15   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   15   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   16   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   16   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   16   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   16   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   16   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   17   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   17   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   17   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   17   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   17   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   18   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   18   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   18   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   18   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   18   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   19   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   19   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   19   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   19   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   19   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   20   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   20   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   20   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   20   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   20   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   21   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   21   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   21   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   21   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   21   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   22   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   22   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   22   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   22   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   22   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   23   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   23   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   23   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   23   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   23   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   24   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   24   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   24   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   24   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   24   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   25   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   25   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   25   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   25   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   25   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   26   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   26   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   26   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   26   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   26   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   27   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   27   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   27   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   27   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   27   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   28   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   28   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   28   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   28   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   28   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   29   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   29   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   29   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   29   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   29   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   30   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   30   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   30   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   30   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   30   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   31   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   31   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   31   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   31   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   31   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   32   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   32   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   32   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   32   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   32   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   33   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   33   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   33   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   33   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   33   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   34   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   34   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   34   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   34   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   34   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   35   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   35   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   35   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   35   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   35   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   36   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   36   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   36   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   36   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   36   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   37   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   37   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   37   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   37   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   37   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   38   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   38   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   38   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   38   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   38   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   39   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   39   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   39   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   39   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   39   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   40   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   40   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   40   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   40   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   40   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   41   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   41   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   41   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   41   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   41   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   42   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   42   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   42   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   42   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   42   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   43   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   43   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   43   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   43   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   43   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   44   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   44   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   44   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   44   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   44   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   45   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   45   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   45   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   45   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   45   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   46   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   46   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   46   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   46   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   46   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   47   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   47   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   47   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   47   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   47   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   48   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   48   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   48   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   48   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   48   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   49   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   49   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   49   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   49   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   49   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   50   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   50   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   50   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   50   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   50   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
train_selected_reform <- complete(tempData,1)
apply(is.na(train_selected_reform),2,sum)
##   MSSubClass      LotArea     LotShape Neighborhood   HouseStyle 
##            0            0            0            0            0 
##  OverallQual    YearBuilt YearRemodAdd   MasVnrArea    ExterQual 
##            0            0            0            0            0 
##   Foundation     BsmtQual BsmtFinType1   BsmtFinSF1    BsmtUnfSF 
##            0            0            0            0            0 
##  TotalBsmtSF    HeatingQC    X1stFlrSF    X2ndFlrSF    GrLivArea 
##            0            0            0            0            0 
## BsmtFullBath     FullBath     HalfBath  KitchenQual TotRmsAbvGrd 
##            0            0            0            0            0 
##   Fireplaces  FireplaceQu  GarageYrBlt GarageFinish   GarageArea 
##            0            0            0            0            0 
##    SalePrice 
##            0
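
The check above counts the missing values left in each column after imputation. As a minimal base-R sketch on a made-up data frame (the `toy` object and its values are hypothetical and are not taken from the competition data), `colSums(is.na(...))` gives the same per-column counts as `apply(is.na(...), 2, sum)`:

```r
# Hypothetical toy data frame, used only for illustration
toy <- data.frame(
  LotArea    = c(8450, NA, 11250),
  MasVnrArea = c(196, 0, NA),
  GarageArea = c(548, 460, 608)
)

# Per-column NA counts, computed two equivalent ways
na_by_apply   <- apply(is.na(toy), 2, sum)
na_by_colsums <- colSums(is.na(toy))

print(na_by_colsums)

# TRUE only when no column has any missing value left
all_complete <- all(na_by_colsums == 0)
```

A single logical such as `all_complete` can be easier to check at a glance than a long table of zeros.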



Modeling


We now build and evaluate two simple regression models that will need to be further tuned.
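
To make the evaluation step concrete, here is a minimal sketch of k-fold cross-validation written in base R. It uses the built-in `mtcars` data and a plain `lm` fit as stand-ins (both are assumptions for illustration, not the competition data or the models used in this program), and the helper name `cv_rmse` is hypothetical.

```r
# Hypothetical helper: k-fold cross-validated RMSE for a linear model
cv_rmse <- function(formula, data, k = 8, seed = 1) {
  set.seed(seed)
  # Randomly assign each row to one of k folds of near-equal size
  folds <- sample(rep(seq_len(k), length.out = nrow(data)))
  response <- all.vars(formula)[1]
  fold_rmse <- numeric(k)
  for (i in seq_len(k)) {
    train_part <- data[folds != i, , drop = FALSE]  # rows used for fitting
    held_out   <- data[folds == i, , drop = FALSE]  # rows used for scoring
    fit  <- lm(formula, data = train_part)
    pred <- predict(fit, newdata = held_out)
    fold_rmse[i] <- sqrt(mean((held_out[[response]] - pred)^2))
  }
  fold_rmse
}

# Stand-in example: 8 folds on the built-in mtcars data
rmse_per_fold <- cv_rmse(mpg ~ wt + hp, data = mtcars, k = 8)
mean(rmse_per_fold)
```

Each row is held out exactly once, so the average of the per-fold errors estimates how the model performs on data it was not fitted to. caret's `trainControl(method="cv", ...)` automates this kind of split, fit, and score loop.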

Generalized Boosted Regression Models (GBM)

### 1
# Set up 8-fold cross-validation
# (`repeats` only applies to method="repeatedcv", so it is dropped here)
train_control <- trainControl(method="cv", number=8)

# Build the Generalized Boosted Regression Models
gbm <- train(SalePrice~., data=train_selected_reform, trControl=train_control, method="gbm")
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1221             nan     0.1000    0.0124
##      2        0.1124             nan     0.1000    0.0099
##      3        0.1042             nan     0.1000    0.0080
##      4        0.0964             nan     0.1000    0.0073
##      5        0.0905             nan     0.1000    0.0061
##      6        0.0848             nan     0.1000    0.0053
##      7        0.0803             nan     0.1000    0.0045
##      8        0.0755             nan     0.1000    0.0051
##      9        0.0712             nan     0.1000    0.0041
##     10        0.0679             nan     0.1000    0.0034
##     20        0.0444             nan     0.1000    0.0015
##     40        0.0276             nan     0.1000    0.0004
##     60        0.0216             nan     0.1000    0.0002
##     80        0.0188             nan     0.1000    0.0001
##    100        0.0174             nan     0.1000    0.0000
##    120        0.0166             nan     0.1000   -0.0000
##    140        0.0160             nan     0.1000    0.0000
##    150        0.0158             nan     0.1000    0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1199             nan     0.1000    0.0143
##      2        0.1079             nan     0.1000    0.0117
##      3        0.0977             nan     0.1000    0.0101
##      4        0.0890             nan     0.1000    0.0084
##      5        0.0819             nan     0.1000    0.0067
##      6        0.0757             nan     0.1000    0.0057
##      7        0.0701             nan     0.1000    0.0054
##      8        0.0649             nan     0.1000    0.0048
##      9        0.0603             nan     0.1000    0.0040
##     10        0.0558             nan     0.1000    0.0039
##     20        0.0331             nan     0.1000    0.0012
##     40        0.0199             nan     0.1000    0.0003
##     60        0.0162             nan     0.1000    0.0000
##     80        0.0148             nan     0.1000    0.0000
##    100        0.0140             nan     0.1000   -0.0000
##    120        0.0132             nan     0.1000   -0.0000
##    140        0.0127             nan     0.1000   -0.0000
##    150        0.0124             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1177             nan     0.1000    0.0158
##      2        0.1047             nan     0.1000    0.0128
##      3        0.0939             nan     0.1000    0.0097
##      4        0.0846             nan     0.1000    0.0089
##      5        0.0772             nan     0.1000    0.0075
##      6        0.0706             nan     0.1000    0.0065
##      7        0.0647             nan     0.1000    0.0059
##      8        0.0595             nan     0.1000    0.0049
##      9        0.0545             nan     0.1000    0.0048
##     10        0.0501             nan     0.1000    0.0039
##     20        0.0276             nan     0.1000    0.0010
##     40        0.0163             nan     0.1000    0.0001
##     60        0.0138             nan     0.1000   -0.0000
##     80        0.0126             nan     0.1000   -0.0000
##    100        0.0118             nan     0.1000   -0.0000
##    120        0.0111             nan     0.1000   -0.0000
##    140        0.0105             nan     0.1000   -0.0000
##    150        0.0103             nan     0.1000   -0.0001
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1196             nan     0.1000    0.0119
##      2        0.1099             nan     0.1000    0.0095
##      3        0.1027             nan     0.1000    0.0077
##      4        0.0954             nan     0.1000    0.0071
##      5        0.0890             nan     0.1000    0.0053
##      6        0.0837             nan     0.1000    0.0050
##      7        0.0784             nan     0.1000    0.0049
##      8        0.0742             nan     0.1000    0.0042
##      9        0.0701             nan     0.1000    0.0036
##     10        0.0661             nan     0.1000    0.0038
##     20        0.0434             nan     0.1000    0.0010
##     40        0.0268             nan     0.1000    0.0004
##     60        0.0210             nan     0.1000    0.0001
##     80        0.0183             nan     0.1000   -0.0000
##    100        0.0169             nan     0.1000    0.0000
##    120        0.0160             nan     0.1000    0.0000
##    140        0.0154             nan     0.1000   -0.0000
##    150        0.0153             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1180             nan     0.1000    0.0139
##      2        0.1062             nan     0.1000    0.0108
##      3        0.0968             nan     0.1000    0.0085
##      4        0.0889             nan     0.1000    0.0081
##      5        0.0817             nan     0.1000    0.0074
##      6        0.0747             nan     0.1000    0.0066
##      7        0.0689             nan     0.1000    0.0055
##      8        0.0645             nan     0.1000    0.0040
##      9        0.0598             nan     0.1000    0.0039
##     10        0.0561             nan     0.1000    0.0036
##     20        0.0331             nan     0.1000    0.0012
##     40        0.0198             nan     0.1000    0.0002
##     60        0.0160             nan     0.1000    0.0000
##     80        0.0145             nan     0.1000    0.0000
##    100        0.0137             nan     0.1000    0.0000
##    120        0.0131             nan     0.1000   -0.0000
##    140        0.0126             nan     0.1000   -0.0000
##    150        0.0123             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1153             nan     0.1000    0.0151
##      2        0.1022             nan     0.1000    0.0130
##      3        0.0918             nan     0.1000    0.0101
##      4        0.0823             nan     0.1000    0.0085
##      5        0.0746             nan     0.1000    0.0068
##      6        0.0680             nan     0.1000    0.0056
##      7        0.0624             nan     0.1000    0.0054
##      8        0.0574             nan     0.1000    0.0048
##      9        0.0530             nan     0.1000    0.0042
##     10        0.0487             nan     0.1000    0.0037
##     20        0.0275             nan     0.1000    0.0011
##     40        0.0167             nan     0.1000    0.0001
##     60        0.0140             nan     0.1000    0.0000
##     80        0.0129             nan     0.1000   -0.0000
##    100        0.0119             nan     0.1000   -0.0000
##    120        0.0113             nan     0.1000   -0.0000
##    140        0.0107             nan     0.1000   -0.0000
##    150        0.0104             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1199             nan     0.1000    0.0121
##      2        0.1102             nan     0.1000    0.0098
##      3        0.1023             nan     0.1000    0.0080
##      4        0.0954             nan     0.1000    0.0069
##      5        0.0894             nan     0.1000    0.0056
##      6        0.0840             nan     0.1000    0.0053
##      7        0.0790             nan     0.1000    0.0044
##      8        0.0750             nan     0.1000    0.0039
##      9        0.0710             nan     0.1000    0.0039
##     10        0.0674             nan     0.1000    0.0034
##     20        0.0436             nan     0.1000    0.0014
##     40        0.0274             nan     0.1000    0.0004
##     60        0.0213             nan     0.1000    0.0001
##     80        0.0186             nan     0.1000    0.0001
##    100        0.0171             nan     0.1000    0.0000
##    120        0.0163             nan     0.1000   -0.0001
##    140        0.0156             nan     0.1000    0.0000
##    150        0.0154             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1191             nan     0.1000    0.0141
##      2        0.1076             nan     0.1000    0.0120
##      3        0.0981             nan     0.1000    0.0098
##      4        0.0896             nan     0.1000    0.0085
##      5        0.0811             nan     0.1000    0.0077
##      6        0.0751             nan     0.1000    0.0062
##      7        0.0694             nan     0.1000    0.0054
##      8        0.0647             nan     0.1000    0.0047
##      9        0.0599             nan     0.1000    0.0049
##     10        0.0560             nan     0.1000    0.0039
##     20        0.0330             nan     0.1000    0.0011
##     40        0.0197             nan     0.1000    0.0001
##     60        0.0162             nan     0.1000   -0.0000
##     80        0.0146             nan     0.1000   -0.0000
##    100        0.0137             nan     0.1000    0.0000
##    120        0.0131             nan     0.1000   -0.0000
##    140        0.0125             nan     0.1000   -0.0000
##    150        0.0123             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1167             nan     0.1000    0.0155
##      2        0.1039             nan     0.1000    0.0134
##      3        0.0934             nan     0.1000    0.0105
##      4        0.0843             nan     0.1000    0.0091
##      5        0.0768             nan     0.1000    0.0070
##      6        0.0704             nan     0.1000    0.0060
##      7        0.0639             nan     0.1000    0.0064
##      8        0.0585             nan     0.1000    0.0049
##      9        0.0540             nan     0.1000    0.0043
##     10        0.0500             nan     0.1000    0.0039
##     20        0.0272             nan     0.1000    0.0012
##     40        0.0166             nan     0.1000    0.0002
##     60        0.0140             nan     0.1000   -0.0000
##     80        0.0129             nan     0.1000   -0.0001
##    100        0.0120             nan     0.1000   -0.0001
##    120        0.0113             nan     0.1000   -0.0000
##    140        0.0107             nan     0.1000   -0.0000
##    150        0.0105             nan     0.1000    0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1201             nan     0.1000    0.0120
##      2        0.1101             nan     0.1000    0.0100
##      3        0.1023             nan     0.1000    0.0078
##      4        0.0951             nan     0.1000    0.0072
##      5        0.0888             nan     0.1000    0.0056
##      6        0.0834             nan     0.1000    0.0052
##      7        0.0787             nan     0.1000    0.0042
##      8        0.0743             nan     0.1000    0.0039
##      9        0.0703             nan     0.1000    0.0037
##     10        0.0667             nan     0.1000    0.0035
##     20        0.0441             nan     0.1000    0.0013
##     40        0.0281             nan     0.1000    0.0005
##     60        0.0221             nan     0.1000    0.0001
##     80        0.0193             nan     0.1000    0.0001
##    100        0.0179             nan     0.1000    0.0000
##    120        0.0171             nan     0.1000   -0.0000
##    140        0.0165             nan     0.1000   -0.0000
##    150        0.0162             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1174             nan     0.1000    0.0136
##      2        0.1054             nan     0.1000    0.0119
##      3        0.0959             nan     0.1000    0.0093
##      4        0.0881             nan     0.1000    0.0075
##      5        0.0811             nan     0.1000    0.0063
##      6        0.0748             nan     0.1000    0.0060
##      7        0.0693             nan     0.1000    0.0053
##      8        0.0641             nan     0.1000    0.0053
##      9        0.0596             nan     0.1000    0.0046
##     10        0.0562             nan     0.1000    0.0033
##     20        0.0338             nan     0.1000    0.0011
##     40        0.0208             nan     0.1000    0.0003
##     60        0.0172             nan     0.1000    0.0000
##     80        0.0157             nan     0.1000   -0.0000
##    100        0.0148             nan     0.1000   -0.0000
##    120        0.0140             nan     0.1000   -0.0000
##    140        0.0134             nan     0.1000    0.0000
##    150        0.0131             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1170             nan     0.1000    0.0165
##      2        0.1043             nan     0.1000    0.0129
##      3        0.0932             nan     0.1000    0.0102
##      4        0.0848             nan     0.1000    0.0082
##      5        0.0768             nan     0.1000    0.0080
##      6        0.0705             nan     0.1000    0.0060
##      7        0.0640             nan     0.1000    0.0062
##      8        0.0590             nan     0.1000    0.0047
##      9        0.0543             nan     0.1000    0.0041
##     10        0.0502             nan     0.1000    0.0038
##     20        0.0282             nan     0.1000    0.0010
##     40        0.0172             nan     0.1000    0.0001
##     60        0.0147             nan     0.1000   -0.0000
##     80        0.0133             nan     0.1000   -0.0000
##    100        0.0124             nan     0.1000   -0.0000
##    120        0.0117             nan     0.1000   -0.0001
##    140        0.0110             nan     0.1000   -0.0000
##    150        0.0108             nan     0.1000   -0.0001
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1184             nan     0.1000    0.0118
##      2        0.1087             nan     0.1000    0.0095
##      3        0.1011             nan     0.1000    0.0078
##      4        0.0940             nan     0.1000    0.0070
##      5        0.0887             nan     0.1000    0.0055
##      6        0.0831             nan     0.1000    0.0053
##      7        0.0787             nan     0.1000    0.0046
##      8        0.0747             nan     0.1000    0.0038
##      9        0.0707             nan     0.1000    0.0035
##     10        0.0669             nan     0.1000    0.0034
##     20        0.0441             nan     0.1000    0.0014
##     40        0.0277             nan     0.1000    0.0002
##     60        0.0214             nan     0.1000    0.0001
##     80        0.0186             nan     0.1000    0.0001
##    100        0.0170             nan     0.1000    0.0000
##    120        0.0162             nan     0.1000   -0.0000
##    140        0.0155             nan     0.1000   -0.0001
##    150        0.0153             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1164             nan     0.1000    0.0140
##      2        0.1052             nan     0.1000    0.0107
##      3        0.0958             nan     0.1000    0.0089
##      4        0.0879             nan     0.1000    0.0071
##      5        0.0809             nan     0.1000    0.0066
##      6        0.0747             nan     0.1000    0.0059
##      7        0.0693             nan     0.1000    0.0050
##      8        0.0647             nan     0.1000    0.0038
##      9        0.0599             nan     0.1000    0.0043
##     10        0.0560             nan     0.1000    0.0035
##     20        0.0335             nan     0.1000    0.0012
##     40        0.0202             nan     0.1000    0.0002
##     60        0.0163             nan     0.1000    0.0001
##     80        0.0148             nan     0.1000    0.0000
##    100        0.0138             nan     0.1000   -0.0000
##    120        0.0131             nan     0.1000    0.0000
##    140        0.0125             nan     0.1000   -0.0000
##    150        0.0122             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1149             nan     0.1000    0.0157
##      2        0.1018             nan     0.1000    0.0133
##      3        0.0915             nan     0.1000    0.0101
##      4        0.0826             nan     0.1000    0.0088
##      5        0.0752             nan     0.1000    0.0071
##      6        0.0687             nan     0.1000    0.0064
##      7        0.0629             nan     0.1000    0.0054
##      8        0.0577             nan     0.1000    0.0053
##      9        0.0535             nan     0.1000    0.0036
##     10        0.0495             nan     0.1000    0.0036
##     20        0.0284             nan     0.1000    0.0012
##     40        0.0169             nan     0.1000    0.0001
##     60        0.0142             nan     0.1000   -0.0000
##     80        0.0129             nan     0.1000   -0.0000
##    100        0.0119             nan     0.1000   -0.0000
##    120        0.0112             nan     0.1000   -0.0000
##    140        0.0107             nan     0.1000   -0.0000
##    150        0.0104             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1193             nan     0.1000    0.0119
##      2        0.1098             nan     0.1000    0.0097
##      3        0.1022             nan     0.1000    0.0076
##      4        0.0952             nan     0.1000    0.0066
##      5        0.0885             nan     0.1000    0.0064
##      6        0.0833             nan     0.1000    0.0050
##      7        0.0785             nan     0.1000    0.0047
##      8        0.0743             nan     0.1000    0.0040
##      9        0.0704             nan     0.1000    0.0040
##     10        0.0668             nan     0.1000    0.0034
##     20        0.0441             nan     0.1000    0.0014
##     40        0.0267             nan     0.1000    0.0004
##     60        0.0208             nan     0.1000    0.0002
##     80        0.0182             nan     0.1000    0.0000
##    100        0.0168             nan     0.1000   -0.0000
##    120        0.0160             nan     0.1000   -0.0000
##    140        0.0154             nan     0.1000    0.0000
##    150        0.0151             nan     0.1000    0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1175             nan     0.1000    0.0137
##      2        0.1063             nan     0.1000    0.0110
##      3        0.0967             nan     0.1000    0.0097
##      4        0.0886             nan     0.1000    0.0078
##      5        0.0809             nan     0.1000    0.0075
##      6        0.0750             nan     0.1000    0.0054
##      7        0.0691             nan     0.1000    0.0058
##      8        0.0644             nan     0.1000    0.0043
##      9        0.0601             nan     0.1000    0.0041
##     10        0.0563             nan     0.1000    0.0037
##     20        0.0324             nan     0.1000    0.0011
##     40        0.0194             nan     0.1000    0.0002
##     60        0.0159             nan     0.1000    0.0000
##     80        0.0144             nan     0.1000    0.0000
##    100        0.0134             nan     0.1000   -0.0000
##    120        0.0127             nan     0.1000   -0.0001
##    140        0.0121             nan     0.1000   -0.0000
##    150        0.0119             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1162             nan     0.1000    0.0150
##      2        0.1041             nan     0.1000    0.0123
##      3        0.0930             nan     0.1000    0.0110
##      4        0.0839             nan     0.1000    0.0089
##      5        0.0760             nan     0.1000    0.0073
##      6        0.0695             nan     0.1000    0.0064
##      7        0.0631             nan     0.1000    0.0060
##      8        0.0581             nan     0.1000    0.0049
##      9        0.0537             nan     0.1000    0.0043
##     10        0.0496             nan     0.1000    0.0040
##     20        0.0277             nan     0.1000    0.0011
##     40        0.0165             nan     0.1000    0.0002
##     60        0.0138             nan     0.1000    0.0000
##     80        0.0125             nan     0.1000   -0.0001
##    100        0.0116             nan     0.1000   -0.0001
##    120        0.0110             nan     0.1000   -0.0000
##    140        0.0104             nan     0.1000   -0.0000
##    150        0.0101             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1208             nan     0.1000    0.0118
##      2        0.1113             nan     0.1000    0.0093
##      3        0.1032             nan     0.1000    0.0076
##      4        0.0966             nan     0.1000    0.0063
##      5        0.0904             nan     0.1000    0.0062
##      6        0.0844             nan     0.1000    0.0055
##      7        0.0790             nan     0.1000    0.0052
##      8        0.0743             nan     0.1000    0.0042
##      9        0.0710             nan     0.1000    0.0029
##     10        0.0678             nan     0.1000    0.0029
##     20        0.0442             nan     0.1000    0.0015
##     40        0.0273             nan     0.1000    0.0004
##     60        0.0211             nan     0.1000    0.0002
##     80        0.0181             nan     0.1000    0.0000
##    100        0.0165             nan     0.1000    0.0000
##    120        0.0156             nan     0.1000    0.0000
##    140        0.0151             nan     0.1000   -0.0000
##    150        0.0148             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1188             nan     0.1000    0.0133
##      2        0.1075             nan     0.1000    0.0113
##      3        0.0975             nan     0.1000    0.0086
##      4        0.0883             nan     0.1000    0.0089
##      5        0.0814             nan     0.1000    0.0065
##      6        0.0754             nan     0.1000    0.0059
##      7        0.0700             nan     0.1000    0.0052
##      8        0.0653             nan     0.1000    0.0047
##      9        0.0605             nan     0.1000    0.0047
##     10        0.0567             nan     0.1000    0.0036
##     20        0.0335             nan     0.1000    0.0012
##     40        0.0194             nan     0.1000    0.0003
##     60        0.0156             nan     0.1000    0.0001
##     80        0.0140             nan     0.1000    0.0000
##    100        0.0130             nan     0.1000   -0.0000
##    120        0.0124             nan     0.1000   -0.0000
##    140        0.0119             nan     0.1000   -0.0000
##    150        0.0116             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1169             nan     0.1000    0.0150
##      2        0.1038             nan     0.1000    0.0131
##      3        0.0931             nan     0.1000    0.0107
##      4        0.0840             nan     0.1000    0.0090
##      5        0.0762             nan     0.1000    0.0073
##      6        0.0699             nan     0.1000    0.0059
##      7        0.0638             nan     0.1000    0.0060
##      8        0.0585             nan     0.1000    0.0053
##      9        0.0538             nan     0.1000    0.0042
##     10        0.0496             nan     0.1000    0.0038
##     20        0.0274             nan     0.1000    0.0012
##     40        0.0162             nan     0.1000    0.0001
##     60        0.0135             nan     0.1000    0.0000
##     80        0.0123             nan     0.1000   -0.0001
##    100        0.0115             nan     0.1000   -0.0001
##    120        0.0108             nan     0.1000    0.0000
##    140        0.0103             nan     0.1000   -0.0000
##    150        0.0100             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1191             nan     0.1000    0.0121
##      2        0.1093             nan     0.1000    0.0096
##      3        0.1016             nan     0.1000    0.0077
##      4        0.0940             nan     0.1000    0.0072
##      5        0.0879             nan     0.1000    0.0056
##      6        0.0825             nan     0.1000    0.0051
##      7        0.0777             nan     0.1000    0.0045
##      8        0.0731             nan     0.1000    0.0042
##      9        0.0693             nan     0.1000    0.0039
##     10        0.0659             nan     0.1000    0.0031
##     20        0.0430             nan     0.1000    0.0013
##     40        0.0265             nan     0.1000    0.0004
##     60        0.0204             nan     0.1000    0.0002
##     80        0.0177             nan     0.1000    0.0000
##    100        0.0163             nan     0.1000    0.0000
##    120        0.0154             nan     0.1000    0.0000
##    140        0.0148             nan     0.1000   -0.0000
##    150        0.0145             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1175             nan     0.1000    0.0143
##      2        0.1056             nan     0.1000    0.0112
##      3        0.0964             nan     0.1000    0.0095
##      4        0.0878             nan     0.1000    0.0079
##      5        0.0800             nan     0.1000    0.0075
##      6        0.0733             nan     0.1000    0.0064
##      7        0.0675             nan     0.1000    0.0052
##      8        0.0625             nan     0.1000    0.0046
##      9        0.0581             nan     0.1000    0.0042
##     10        0.0544             nan     0.1000    0.0036
##     20        0.0325             nan     0.1000    0.0011
##     40        0.0195             nan     0.1000    0.0002
##     60        0.0157             nan     0.1000    0.0001
##     80        0.0141             nan     0.1000   -0.0001
##    100        0.0131             nan     0.1000   -0.0000
##    120        0.0124             nan     0.1000   -0.0000
##    140        0.0118             nan     0.1000   -0.0001
##    150        0.0116             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1159             nan     0.1000    0.0151
##      2        0.1033             nan     0.1000    0.0121
##      3        0.0928             nan     0.1000    0.0107
##      4        0.0837             nan     0.1000    0.0092
##      5        0.0760             nan     0.1000    0.0072
##      6        0.0684             nan     0.1000    0.0073
##      7        0.0626             nan     0.1000    0.0057
##      8        0.0574             nan     0.1000    0.0044
##      9        0.0526             nan     0.1000    0.0046
##     10        0.0484             nan     0.1000    0.0038
##     20        0.0267             nan     0.1000    0.0011
##     40        0.0158             nan     0.1000    0.0001
##     60        0.0129             nan     0.1000   -0.0000
##     80        0.0117             nan     0.1000   -0.0000
##    100        0.0110             nan     0.1000   -0.0001
##    120        0.0104             nan     0.1000   -0.0000
##    140        0.0098             nan     0.1000   -0.0000
##    150        0.0095             nan     0.1000   -0.0001
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1165             nan     0.1000    0.0149
##      2        0.1036             nan     0.1000    0.0125
##      3        0.0930             nan     0.1000    0.0105
##      4        0.0842             nan     0.1000    0.0080
##      5        0.0764             nan     0.1000    0.0078
##      6        0.0698             nan     0.1000    0.0062
##      7        0.0637             nan     0.1000    0.0060
##      8        0.0588             nan     0.1000    0.0048
##      9        0.0541             nan     0.1000    0.0042
##     10        0.0501             nan     0.1000    0.0036
##     20        0.0285             nan     0.1000    0.0012
##     40        0.0172             nan     0.1000    0.0002
##     60        0.0143             nan     0.1000   -0.0000
##     80        0.0131             nan     0.1000   -0.0000
##    100        0.0121             nan     0.1000   -0.0000
##    120        0.0114             nan     0.1000   -0.0000
##    140        0.0108             nan     0.1000   -0.0000
##    150        0.0106             nan     0.1000   -0.0000
# Make predictions on the training data
prediction <- predict(gbm, train_selected_reform)
binded <- cbind(train_selected_reform, prediction)

Calculate the RMSE on the training data

res <- binded$SalePrice - prediction
rmse <- sqrt(mean(res ^ 2))
print(rmse)
## [1] 0.1027929
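
The RMSE above is measured on the same rows the model was trained on, so it will usually look better than the error on unseen houses. The sketch below shows one way to compare a training RMSE with a held-out RMSE. It is a minimal illustration in base R only: `rmse_of` is a hypothetical helper, and the built-in `mtcars` data with a plain `lm` fit stands in for the housing data and the gbm model.

```r
# Hypothetical helper (not part of the original program): root mean squared error
rmse_of <- function(actual, predicted) {
  sqrt(mean((actual - predicted)^2))
}

set.seed(99)
# mtcars ships with R; it is only a stand-in for the housing data
idx <- sample(nrow(mtcars), size = 24)   # 24 of the 32 rows go to training
train_part <- mtcars[idx, ]
valid_part <- mtcars[-idx, ]

# A simple linear model stands in for the boosted model
fit <- lm(mpg ~ wt + hp, data = train_part)

# Error on rows the model has seen vs. rows it has not seen
rmse_train <- rmse_of(train_part$mpg, predict(fit, train_part))
rmse_valid <- rmse_of(valid_part$mpg, predict(fit, valid_part))
print(c(train = rmse_train, valid = rmse_valid))
```

The same comparison could be made with the models in this post by holding back part of the training data before calling `train`.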

Example: this code was not run; it is just an idea of how this algorithm could be further tuned.

library(hydroGOF)
library(Metrics)

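# Candidate tuning values; n.trees starts at 50 because a model with 0 trees is not a useful candidate
caretGrid <- expand.grid(interaction.depth = c(1, 3, 5),
                         n.trees = (1:50) * 50,
                         shrinkage = c(0.01, 0.001),
                         n.minobsinnode = 10)
metric <- "RMSE"

set.seed(99)
# bag.fraction is passed through to gbm()
gbm.caret <- train(SalePrice ~ ., data = train_selected_reform, method = "gbm",
                   trControl = train_control, verbose = FALSE,
                   tuneGrid = caretGrid, metric = metric, bag.fraction = 0.75)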

print(gbm.caret)
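
Before starting a search like this, it can help to check how many candidate settings the grid holds, since the run time grows with that number and with the number of resamples. The sketch below uses base R only. The grid is a reduced, hypothetical one (not the grid above), and `n_resamples` is an assumed value, not the setting stored in `train_control`.

```r
# Reduced, hypothetical grid with the same four gbm tuning parameters
small_grid <- expand.grid(interaction.depth = c(1, 3, 5),
                          n.trees = c(100, 500, 1000),
                          shrinkage = c(0.01, 0.001),
                          n.minobsinnode = 10)

# expand.grid returns one row per combination of the supplied values
n_combos <- nrow(small_grid)

# Assumption: 10 resamples (e.g. 10-fold cross-validation)
n_resamples <- 10

cat("combinations:", n_combos, "\n")
cat("rough upper bound on model fits:", n_combos * n_resamples, "\n")
```

If the count is too large, trimming the `n.trees` sequence or the `shrinkage` values is a simple way to shrink the search.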



Extreme Gradient Boosting (XGBoost)

xgbTree <- train(SalePrice ~ ., data = train_selected_reform, trControl = train_control, method = "xgbTree")
# Make predictions on the training data
prediction <- predict(xgbTree, train_selected_reform)
binded <- cbind(train_selected_reform, prediction)

Calculate the RMSE on the training data

res <- binded$SalePrice - prediction
rmse <- sqrt(mean(res ^ 2))
print(rmse)
## [1] 0.08545178

* Learn more about how to tune an XGBoost model here: https://github.com/topepo/caret/blob/master/RegressionTests/Code/xgbTree.R
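
As a starting point for such tuning, a grid for the xgbTree method could be sketched as follows. The column names are the tuning parameters that, to my understanding, recent caret versions expect for `method = "xgbTree"`; running `modelLookup("xgbTree")` with caret loaded should confirm them for your installed version. The values are illustrative assumptions, not tuned for this dataset, and the `train` call is left commented out because it was not run.

```r
# Illustrative values only; every number here is an assumption
xgbGrid <- expand.grid(nrounds = c(100, 200),          # boosting iterations
                       max_depth = c(2, 4, 6),         # maximum tree depth
                       eta = c(0.05, 0.1),             # learning rate
                       gamma = 0,                      # minimum loss reduction
                       colsample_bytree = c(0.6, 0.8), # columns sampled per tree
                       min_child_weight = 1,
                       subsample = 0.75)               # rows sampled per tree

# Number of candidate settings in the grid
print(nrow(xgbGrid))

# With caret loaded and the objects from the code above available,
# the grid could be passed like this (not run here):
# xgbTree.tuned <- train(SalePrice ~ ., data = train_selected_reform,
#                        method = "xgbTree", trControl = train_control,
#                        tuneGrid = xgbGrid, metric = "RMSE")
```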



